A robust Kantorovich's theorem on inexact Newton method with relative residual error tolerance

O. P. Ferreira* B. F. Svaiter†

November 12, 2010 

Abstract 

We prove that, under semi-local assumptions, the inexact Newton method with a fixed relative
residual error tolerance converges Q-linearly to a zero of the nonlinear operator under
consideration. Using this result, we show that Newton's method for minimizing a self-concordant
function or for finding a zero of an analytic function can be implemented with a fixed relative
residual error tolerance.

In the absence of errors, our analysis retrieves the classical Kantorovich theorem on Newton's
method.

Introduction

Newton's method and its variations, including the inexact Newton methods, are the most efficient
methods known for solving nonlinear equations

\[ F(x) = 0, \]

where $X$ and $Y$ are Banach spaces, $C \subseteq X$ and $F : C \to Y$ is continuous and continuously
differentiable on $\mathrm{int}(C)$.

Kantorovich's theorem on Newton's method uses semi-local assumptions on $F$ to guarantee the
existence of a solution of the above equation, the uniqueness of this solution in a prescribed
region and also the convergence of Newton's method to such a solution, see [8], [9]. Semi-local
convergence theorems for Newton's method have been instrumental in the modern complexity analysis
of the solution of polynomial (or analytic) equations [H [18], linear and quadratic programming
problems and linear semi-definite programming problems [15], [16]. These convergence results have
also been used in the design and convergence analysis of algorithms for these problems. In all
these applications, homotopy methods are combined with Newton's method, which helps the algorithm
keep track of the solution of a parametrized perturbed version of the original problem. Each Newton
iteration requires the solution of a linear system, and this accounts for most of the computational
burden of these algorithms.

Since linear systems are always solved inexactly in floating-point computations, it is natural to
investigate the robustness of Kantorovich's and Kantorovich-like theorems under errors in the
computation of the Newton step. Moreover, modern implementations of the conjugate gradient method,
coupled with preconditioning, allow for the approximate solution of large linear systems. It would
be most desirable to have an a priori prescribed residual error tolerance for the iterative
solution of the linear systems that define the Newton steps, because this would prevent
over-solving and/or under-solving the linear system in question. Although the local convergence
analysis of Newton's method with relative errors in the residue [Sj [4], [14] or in the step [22]
is well understood, the convergence analysis of the method under general semi-local assumptions,
assuming only bounded relative residual errors, is a new contribution of this paper. Previous works
on this subject include [13], [17]. The advantage of working with an error tolerance on the
residual rests in the fact that the exact Newton step need not be known for evaluating this error,
which makes this criterion attractive for practical applications.

Recently, Kantorovich's theorem on Newton's method was extended to Riemannian manifolds
using a new technique which simplifies the analysis and proof of this theorem, see [5]. After that,
this technique was successfully employed for proving generalized versions of Kantorovich's theorem
in Riemannian manifolds and also in the analysis of the classical version of Kantorovich's theorem
in Banach spaces, see[ll [H [lOl [Til HI 113 ISQl [H]- In the present work, we will use the technique 
introduced in [5] to present a robust version of Kantorovich's theorem on the inexact Newton
method with relative residual error. It is worth pointing out that, for a null error tolerance, the
analysis presented reduces to the usual semi-local convergence analysis of Newton's method, see p].
The basic idea is to find good regions for the inexact Newton method. In these regions, the
majorant function bounds the nonlinear function whose root is to be found, and the behavior of the
inexact Newton iteration in these regions is estimated using iterations associated with the
majorant function. Moreover, as a whole, the union of all these regions is invariant under the
inexact Newton iteration.

This paper is organized as follows. In Section 1, some definitions and auxiliary results are
presented. In Section 2, the main result is stated and some properties of the majorant function
are established. The main relationships between the majorant function and the nonlinear operator
used in the paper are presented in Section 3. In Section 4, a family of regions where the behavior
of the inexact Newton iteration is estimated using the majorant function is introduced. We also
show that the union of all these regions is invariant under the inexact Newton iteration with a
fixed relative residual error tolerance. In Section 5, the main result is proved. In Section 6, we
show that Newton's method for minimizing a self-concordant function, under the usual semi-local
assumption for these functions, can be implemented with a fixed relative residual error tolerance.
Moreover, we show that Newton's method for finding a zero of an analytic function, under the usual
semi-local assumption of the $\alpha$-theory, can also be implemented with a fixed relative
residual error tolerance.

1 Basic definitions and auxiliary results

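Let $X$ be a Banach space. The open and closed balls at $x$ are denoted, respectively, by

\[ B(x,r) = \{y \in X : \|x - y\| < r\}, \qquad B[x,r] = \{y \in X : \|x - y\| \le r\}. \]

The following auxiliary results of elementary convex analysis will be needed:

Proposition 1.1. Let $I \subset \mathbb{R}$ be an interval, and $\varphi : I \to \mathbb{R}$ be convex.

1. For any $u_0 \in \mathrm{int}(I)$, the application

\[ u \mapsto \frac{\varphi(u_0) - \varphi(u)}{u_0 - u}, \qquad u \in I, \ u \ne u_0, \]

is increasing and there exist (in $\mathbb{R}$)

\[ D^-\varphi(u_0) = \lim_{u \to u_{0-}} \frac{\varphi(u_0) - \varphi(u)}{u_0 - u} = \sup_{u < u_0} \frac{\varphi(u_0) - \varphi(u)}{u_0 - u}. \]

2. If $u, v, w \in I$, $u < w$, and $u \le v \le w$ then

\[ \varphi(v) - \varphi(u) \le [\varphi(w) - \varphi(u)] \frac{v - u}{w - u}. \]

Proof. See [7]. □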
Proposition 1.2. If $h : [a,b) \to \mathbb{R}$ is convex, differentiable at $a$, $h'(a) < 0$ and

\[ \lim_{t \to b_-} h(t) = 0, \]

then

\[ a - \frac{h(a)}{h'(a)} \le b, \]

with equality if and only if $h$ is affine in $[a,b)$.

Proof. Since $h$ is convex, $h(a) + h'(a)(t - a) \le h(t)$ for any $t \in [a,b)$. Taking the limit
$t \to b_-$ we obtain

\[ h(a) + h'(a)(b - a) \le 0. \]

The desired inequality now follows by multiplying this inequality by the strictly positive number
$-1/h'(a)$. If the above inequality holds as an equality, then

\[ h'(a) = \frac{-h(a)}{b - a}. \]

Let $a < s < t < b$. Using again the convexity of $h$ we have

\[ h(a) + h'(a)(s - a) \le h(s) \le h(a)\frac{t - s}{t - a} + h(t)\frac{s - a}{t - a}. \]

Taking again the limit $t \to b_-$ in the above inequality and using the previous equation we
conclude that $h(s) = h(a)(b - s)/(b - a)$, i.e., $h$ is affine. If $h$ is affine then the
inequality of the proposition holds trivially as an equality. □

2 The inexact Newton method with relative error 

Our goal is to prove the following version of Kantorovich's theorem on inexact Newton's Method 
with relative error. 

Theorem 2.1. Let $X$ and $Y$ be Banach spaces, $R \in \mathbb{R}$, $C \subseteq X$ and $F : C \to Y$ a
continuous function, continuously differentiable on $\mathrm{int}(C)$. Take $x_0 \in \mathrm{int}(C)$
with $F'(x_0)$ non-singular. Suppose that $f : [0,R) \to \mathbb{R}$ is continuously differentiable,
$B(x_0,R) \subseteq C$,

\[ \|F'(x_0)^{-1}[F'(y) - F'(x)]\| \le f'(\|y - x\| + \|x - x_0\|) - f'(\|x - x_0\|), \tag{2.1} \]

for any $x, y \in B(x_0,R)$, $\|x - x_0\| + \|y - x\| < R$,

\[ \|F'(x_0)^{-1}F(x_0)\| \le f(0), \tag{2.2} \]

h1) $f(0) > 0$, $f'(0) = -1$;
h2) $f'$ is strictly increasing and convex;
h3) $f(t) < 0$ for some $t \in (0,R)$.

Let

\[ \beta := \sup_{t \in [0,R)} -f(t), \qquad t_* := \min f^{-1}(\{0\}), \qquad \bar{\tau} := \sup\{t \in [0,R) : f(t) \le 0\}. \]

Take $0 \le \rho < \beta/2$ and define

\[ \kappa_\rho := \sup_{\rho < t < R} \frac{-(f(t) + 2\rho)}{|f'(\rho)|\,(t - \rho)}, \qquad \lambda_\rho := \sup\{t \in [\rho,R) : \kappa_\rho + f'(t) < 0\}, \qquad \Theta_\rho := \frac{\kappa_\rho}{2 - \kappa_\rho}. \tag{2.3} \]

Then for any $\theta \in [0,\Theta_\rho]$ and $z_0 \in B[x_0,\rho]$, the sequence generated by the inexact
Newton method for solving $F(x) = 0$ with starting point $z_0$ and residual relative error tolerance
$\theta$: for $k = 0, 1, \ldots$,

\[ z_{k+1} = z_k + S_k, \qquad \|F'(z_0)^{-1}[F(z_k) + F'(z_k)S_k]\| \le \theta \|F'(z_0)^{-1}F(z_k)\|, \]

is well defined (for any particular choice of each $S_k$),

\[ \|F'(z_0)^{-1}F(z_k)\| \le \left(\frac{1 + \theta^2}{2}\right)^k [f(0) + 2\rho], \qquad k = 0, 1, \ldots, \tag{2.4} \]

the sequence $\{z_k\}$ is contained in $B(z_0,\lambda_\rho)$ and converges to a point
$x_* \in B[x_0,t_*]$, which is the unique zero of $F$ on $B(x_0,\bar{\tau})$. Moreover, if

h4) $\lambda_\rho < R - \rho$,

then the sequence $\{z_k\}$ satisfies, for $k = 0, 1, \ldots$,

\[ \|x_* - z_{k+1}\| \le \frac{1 + \theta}{2} \frac{D^- f'(\lambda_\rho)}{|f'(\lambda_\rho)|} \|x_* - z_k\|^2 + \theta \frac{f'(\lambda_\rho + \rho) + 2|f'(\rho)|}{|f'(\lambda_\rho + \rho)|} \|x_* - z_k\|. \]

If, additionally, $0 \le \theta < \kappa_\rho/(4 + \kappa_\rho)$ then $\{z_k\}$ converges Q-linearly as
follows

\[ \|x_* - z_{k+1}\| \le \left[\frac{1 + \theta}{2} + \frac{2\theta}{\kappa_\rho}\right] \|x_* - z_k\|, \qquad k = 0, 1, \ldots. \]

Remark 2.2. In Theorem 2.1, if $\theta = 0$ we obtain the exact Newton method and its convergence
properties. Now, taking $\theta = \theta_k$ in each iteration and letting $\theta_k$ go to zero as $k$
goes to infinity, the penultimate inequality of Theorem 2.1 implies that the generated sequence
converges to the solution with superlinear rate.
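
As an illustration of the iteration in Theorem 2.1, the following minimal Python sketch runs an
inexact Newton method on a scalar equation. It is not part of the paper: the name
`inexact_newton`, the `sign` parameter, the stopping rule and the test problem x^2 - 2 = 0 are
assumptions made for this example. In one dimension the factor F'(z_0)^{-1} cancels from both sides
of the residual condition, so the sketch enforces |F(z_k) + F'(z_k) S_k| <= theta |F(z_k)| directly,
and it simulates an inexact linear solve by leaving a residual of exactly that size.

```python
def inexact_newton(F, dF, z0, theta, sign=1.0, tol=1e-12, max_iter=100):
    """Scalar inexact Newton iteration with relative residual tolerance theta.

    Each step s satisfies |F(z) + dF(z) * s| <= theta * |F(z)|, which is the
    scalar form of the residual condition in Theorem 2.1.
    """
    z = z0
    iterates = [z]
    for _ in range(max_iter):
        Fz = F(z)
        if abs(Fz) <= tol:
            break
        # Simulated inexact linear solve (an assumption of this sketch):
        # instead of the exact Newton step s = -Fz / dF(z), accept a step
        # whose linear residual is r = sign * theta * Fz.
        r = sign * theta * Fz
        s = (r - Fz) / dF(z)
        z = z + s
        iterates.append(z)
    return z, iterates


if __name__ == "__main__":
    root, its = inexact_newton(lambda x: x * x - 2.0, lambda x: 2.0 * x,
                               z0=1.5, theta=0.1)
    print(root, len(its) - 1)  # approximate root and number of steps taken
```

With `theta = 0` the sketch reduces to the exact Newton method; a positive `theta` trades the
quadratic rate for cheaper steps that still converge linearly in this example.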

From now on, we assume that the hypotheses of Theorem 2.1 hold. The scalar function $f$
in the above theorem is called a majorant function for $F$ at point $x_0$. Before proceeding, we
will analyze some basic properties of the majorant function. Condition h2 implies strict convexity
of $f$. Note that $t_*$ is the smallest root of $f(t) = 0$ and, since $f$ is convex, if this
equation has two roots, then the second one is $\bar{\tau}$.

Define

\[ \bar{t} := \sup\{t \in [0,R) : f'(t) < 0\}. \tag{2.5} \]



Proposition 2.3. The following statements on the majorant function hold:

i) $f'(t) < 0$ for any $t \in [0,\bar{t})$ (and $f'(t) \ge 0$ for any $t \in [0,R) \setminus [0,\bar{t})$);

ii) $0 < t_* < \bar{t} \le \bar{\tau} \le R$;

iii) $\beta = -\lim_{t \to \bar{t}_-} f(t)$, $0 < \beta < \bar{t}$.

Proof. Item i follows from the second part of h1, h2 and the definition (2.5).

Using the first inequality in h1, h3 and the continuity of $f$ we conclude that $t_*$ is well
defined and

\[ 0 < t_* < R. \]

Condition h2 implies strict convexity of $f$, hence condition h3 and the definition of $t_*$ imply
that there exists $t \in (t_*,R)$ such that

\[ 0 > f(t) > f(t_*) + f'(t_*)(t - t_*) = f'(t_*)(t - t_*), \]

which implies that $0 > f'(t_*)$. Therefore, using item i and the definition of $\bar{t}$ we have

\[ t_* < \bar{t} \le R. \]

Since $t_*$ is the smallest root of $f(t) = 0$ and $f$ is strictly decreasing in $[0,\bar{t})$ we
conclude that $f \le 0$ in $[t_*,\bar{t})$. So, the definition of $\bar{\tau}$ implies that

\[ \bar{t} \le \bar{\tau} \le R, \]

and the proof of item ii is concluded.

Using h3 and the definition of $\beta$ we obtain that $0 < \beta$. Since $f$ is convex, combining
this with h1 we have

\[ f(t) \ge f(0) - t > -t, \qquad 0 \le t < R, \]

with strict inequality for $t \ne 0$. We know that $f$ is strictly decreasing and $f \le 0$ in
$[t_*,\bar{t})$. Hence, letting $t$ go to $\bar{t}_-$ in the last inequality and using the
definition of $\beta$, item iii follows. □

We will first prove Theorem 2.1 for the case $\rho = 0$ and $z_0 = x_0$. In order to simplify the
notation in the case $\rho = 0$, we will use $\kappa$, $\lambda$ and $\Theta$ instead of $\kappa_0$,
$\lambda_0$ and $\Theta_0$ respectively:

\[ \kappa := \sup_{0 < t < R} \frac{-f(t)}{t}, \qquad \lambda := \sup\{t \in [0,R) : \kappa + f'(t) < 0\}, \qquad \Theta := \frac{\kappa}{2 - \kappa}. \tag{2.6} \]

Proposition 2.4. For $\kappa$, $\lambda$, $\Theta$ as in (2.6) it holds that

\[ 0 < \kappa < 1, \qquad 0 < \Theta < 1, \qquad t_* < \lambda \le \bar{t} \le \bar{\tau}, \tag{2.7} \]

and

\[ f'(t) + \kappa < 0 \quad \forall t \in [0,\lambda), \qquad \inf_{0 \le t < R} f(t) + \kappa t = \lim_{t \to \lambda_-} f(t) + \kappa t = 0. \tag{2.8} \]

Proof. Since $f$ is convex, combining this with h1 we have

\[ f(t) \ge f(0) - t > -t, \qquad 0 \le t < R, \]

with strict inequality for $t \ne 0$. For $t \ne 0$, the last inequality is equivalent to

\[ \frac{-f(t)}{t} < 1 - \frac{f(0)}{t} < 1 - \frac{f(0)}{R} < 1, \qquad 0 < t < R, \]

and, using also h3, we conclude that

\[ 0 < \kappa < 1, \qquad 0 < \Theta < 1, \]

where the bounds on $\Theta$ follow from its definition and the bound on $\kappa$. Moreover, as $f'$
is continuous, strictly increasing and $f'(0) = -1$, we obtain

\[ 0 < \lambda, \qquad f'(t) + \kappa < 0 \quad \forall t \in [0,\lambda), \]

\[ \inf_{0 \le t < R} f(t) + \kappa t = \lim_{t \to \lambda_-} f(t) + \kappa t = 0, \]

where the last equalities follow from the definition of $\kappa$.

Note that $f'(t) < -\kappa < 0$ for all $t \in [0,\lambda)$. Since $f'$ is strictly negative in
$[0,\lambda)$, we conclude that $t_* < \lambda \le \bar{t} \le \bar{\tau}$ and the proof is
concluded. □
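
To make the constants in (2.6) and (2.7) concrete, the sketch below evaluates them for the
quadratic majorant f(t) = (L/2) t^2 - t + b, the classical Kantorovich choice. This example is not
taken from the paper: the closed forms were derived by hand for this particular f, assuming
R = +infinity and 2bL < 1, and the function names are hypothetical. A brute-force grid search is
included as an independent check of kappa.

```python
import math


def quadratic_majorant_constants(b, L):
    """Constants of Section 2 for f(t) = (L/2) t^2 - t + b, assuming 2*b*L < 1.

    The closed forms below are derived for this particular f (with R = +inf);
    they are an illustration, not formulas quoted from the paper.
    """
    if not (b > 0 and L > 0 and 2.0 * b * L < 1.0):
        raise ValueError("need b > 0, L > 0 and 2*b*L < 1")
    disc = math.sqrt(1.0 - 2.0 * b * L)
    kappa = 1.0 - math.sqrt(2.0 * b * L)       # sup_{t>0} -f(t)/t
    return {
        "beta": 1.0 / (2.0 * L) - b,            # sup -f(t), attained at t = 1/L
        "t_star": (1.0 - disc) / L,             # smallest root of f
        "t_bar": 1.0 / L,                       # point where f' changes sign
        "tau_bar": (1.0 + disc) / L,            # largest root of f
        "kappa": kappa,
        "lambda": math.sqrt(2.0 * b / L),       # root of kappa + f'(t) = 0
        "Theta": kappa / (2.0 - kappa),         # largest tolerance allowed by (2.6)
    }


def kappa_on_grid(b, L, n=20000):
    """Brute-force estimate of sup_{0<t<2/L} -f(t)/t on a uniform grid."""
    def f(t):
        return 0.5 * L * t * t - t + b
    return max(-f(t) / t for t in (2.0 / L * i / n for i in range(1, n)))
```

For instance, `quadratic_majorant_constants(0.2, 1.0)` gives constants that satisfy the ordering
in (2.7), and `kappa_on_grid(0.2, 1.0)` agrees with the closed-form value of kappa up to the grid
resolution.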



3 Basic results 

In this section we will obtain bounds on $\|F'(x)^{-1}F'(x_0)\|$ and on the linearization error of
$F$ using the majorant function $f$. These bounds will be used in the next section for analyzing
the inexact Newton iterations. It is worth mentioning that in this section the inequality in h1 and
(2.2) will be used only for proving its last result and h3 will not be used.

A Newton iteration at x requires non-singularity of F'{x), which will be verified using the 
majorant function /. 

Proposition 3.1. If $\|x - x_0\| \le t < \bar{t}$ then $F'(x)$ is non-singular and

\[ \|F'(x)^{-1}F'(x_0)\| \le \frac{-1}{f'(t)}. \]

Proof. The definition (2.5) shows that $f'(t) < 0$. Direct manipulation, (2.1), h1 and h2 give us

\[ \|F'(x_0)^{-1}F'(x) - I\| = \|F'(x_0)^{-1}[F'(x) - F'(x_0)]\| \le f'(\|x - x_0\|) - f'(0) \le f'(t) + 1 < 1. \]

Using Banach's Lemma and the last inequality above we conclude that $F'(x_0)^{-1}F'(x)$ is
non-singular and

\[ \|F'(x)^{-1}F'(x_0)\| = \|(F'(x_0)^{-1}F'(x))^{-1}\| \le \frac{1}{1 - (f'(t) + 1)} = \frac{-1}{f'(t)}, \]

which is the desired inequality. □

The linearization errors on $F$ and $f$ are, respectively,

\[ E_F(y,x) := F(y) - [F(x) + F'(x)(y - x)], \qquad x \in B(x_0,R), \ y \in C, \tag{3.1} \]

\[ e_f(v,t) := f(v) - [f(t) + f'(t)(v - t)], \qquad t, v \in [0,R). \tag{3.2} \]

The linearization error of the majorant function bounds the linearization error of $F$.

Lemma 3.2. If $x, y \in X$ and $\|x - x_0\| + \|y - x\| < R$ then

\[ \|F'(x_0)^{-1}E_F(y,x)\| \le e_f(\|x - x_0\| + \|y - x\|, \|x - x_0\|). \]

Proof. Since

\[ x + u(y - x) \in B(x_0,R), \qquad 0 \le u \le 1, \]

and $F$ is continuously differentiable in $B(x_0,R)$, direct use of (3.1) gives

\[ E_F(y,x) = \int_0^1 [F'(x + u(y - x)) - F'(x)](y - x)\,du. \]

Combining the above equality with (2.1) we have

\[ \|F'(x_0)^{-1}E_F(y,x)\| \le \int_0^1 \|F'(x_0)^{-1}[F'(x + u(y - x)) - F'(x)]\| \, \|y - x\|\,du \le \int_0^1 [f'(\|x - x_0\| + u\|y - x\|) - f'(\|x - x_0\|)] \|y - x\|\,du, \]

which after performing the integration and using the definition in (3.2) yields the desired
inequality. □

Convexity of $f$ and $f'$ guarantees that $e_f(t + s,t)$ is increasing in $s$ and $t$.

Lemma 3.3. If $0 \le b \le t$, $0 \le a \le s$ and $t + s < R$ then

\[ e_f(a + b,b) \le e_f(t + s,t), \]

\[ e_f(a + b,b) \le \frac{1}{2} \frac{f'(t + s) - f'(t)}{s} a^2, \qquad s \ne 0. \]

Proof. First note that

\[ e_f(a + b,b) = \int_0^a [f'(b + r) - f'(b)]\,dr. \]

Since $f'$ is convex, for any $r \ge 0$ the function $\tau \mapsto f'(\tau + r) - f'(\tau)$ is
non-decreasing. So,

\[ e_f(a + b,b) \le \int_0^a [f'(t + r) - f'(t)]\,dr \le \int_0^s [f'(t + r) - f'(t)]\,dr, \tag{3.3} \]

where the second inequality follows from the convexity of $f$, which implies positivity of the
integrand. To end the proof of the first inequality, note that the last term in the above
inequality is $e_f(t + s,t)$.

For proving the second inequality, apply Proposition 1.1 item 2 with $u = t$, $v = t + r$,
$w = t + s$ and $\varphi = f'$ in the first inequality in (3.3) to conclude that

\[ e_f(a + b,b) \le \int_0^a [f'(t + s) - f'(t)] \frac{r}{s}\,dr, \]

which after performing the integral gives the desired inequality. □
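
The two inequalities of Lemma 3.3, read here as e_f(a+b, b) <= e_f(t+s, t) and
e_f(a+b, b) <= (1/2) [(f'(t+s) - f'(t))/s] a^2, can be checked numerically. The sketch below is
only a sanity check under assumptions: the majorant f(t) = t/(1 - t) - 2t + 0.1 on [0, 1) and the
helper names are chosen for illustration and are not prescribed by this section.

```python
def f(t):
    """Hypothetical majorant used only for this check: t/(1 - t) - 2t + 0.1 on [0, 1)."""
    return t / (1.0 - t) - 2.0 * t + 0.1


def df(t):
    """Derivative of f; it is convex and strictly increasing on [0, 1)."""
    return 1.0 / (1.0 - t) ** 2 - 2.0


def e_f(v, t):
    """Linearization error e_f(v, t) = f(v) - [f(t) + f'(t) (v - t)] from (3.2)."""
    return f(v) - (f(t) + df(t) * (v - t))


def check_lemma_3_3(a, b, s, t):
    """Return (e_f(a+b, b), e_f(t+s, t), quadratic bound) for 0<=b<=t, 0<=a<=s, s>0."""
    lhs = e_f(a + b, b)
    monotone_bound = e_f(t + s, t)
    quadratic_bound = 0.5 * (df(t + s) - df(t)) / s * a * a
    return lhs, monotone_bound, quadratic_bound
```

Calling `check_lemma_3_3(0.1, 0.05, 0.2, 0.1)` returns three numbers in which the first is bounded
by each of the other two, as the lemma predicts for these admissible parameters.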

Now we are ready to bound the linearization error $E_F$ using the linearization error on the
majorant function.

Corollary 3.4. If $x, y \in X$, $\|x - x_0\| \le t$, $\|y - x\| \le s$ and $s + t < R$ then

\[ \|F'(x_0)^{-1}E_F(y,x)\| \le e_f(t + s,t), \]

\[ \|F'(x_0)^{-1}E_F(y,x)\| \le \frac{1}{2} \frac{f'(s + t) - f'(t)}{s} \|y - x\|^2, \qquad s \ne 0. \]

Proof. The results follow by direct combination of Lemmas 3.2 and 3.3, taking $b = \|x - x_0\|$
and $a = \|y - x\|$. □

The first inequality in the next corollary will be useful for obtaining asymptotic bounds on the 
sequence generated by the inexact Newton method with relative error tolerance, while the second 
inequality will be used to show that this method is robust with respect to the initial iterate. 

Corollary 3.5. For any $y \in B(x_0,R)$,

\[ -f(\|y - x_0\|) \le \|F'(x_0)^{-1}F(y)\| \le f(\|y - x_0\|) + 2\|y - x_0\|. \]

Proof. Using Lemma 3.2 with $x = x_0$, the definition of $E_F$ and the triangle inequality we have

\[ e_f(\|y - x_0\|,0) \ge \|F'(x_0)^{-1}E_F(y,x_0)\| \ge \|F'(x_0)^{-1}F(x_0) + y - x_0\| - \|F'(x_0)^{-1}F(y)\| \ge \|y - x_0\| - \|F'(x_0)^{-1}F(x_0)\| - \|F'(x_0)^{-1}F(y)\|. \]

Combining this inequality with the definition of $e_f$ and using the assumptions h1 and (2.2) we
obtain

\[ f(\|y - x_0\|) - f(0) + \|y - x_0\| \ge \|y - x_0\| - f(0) - \|F'(x_0)^{-1}F(y)\|, \]

which is equivalent to the first inequality of the corollary.

To prove the second inequality, use again Lemma 3.2, the definition of $E_F$ and the triangle
inequality to obtain

\[ e_f(\|y - x_0\|,0) \ge \|F'(x_0)^{-1}E_F(y,x_0)\| \ge \|F'(x_0)^{-1}F(y)\| - \|F'(x_0)^{-1}F(x_0) + y - x_0\| \ge \|F'(x_0)^{-1}F(y)\| - \|F'(x_0)^{-1}F(x_0)\| - \|y - x_0\|. \]

Using the above inequality, the definition of $e_f$, h1 and (2.2) we have

\[ f(\|y - x_0\|) - f(0) + \|y - x_0\| \ge \|F'(x_0)^{-1}F(y)\| - f(0) - \|y - x_0\|, \]

which is equivalent to the second inequality of the corollary. □

Note that the first inequality in the above corollary proves that $F$ has no zeroes in the region
$t_* < \|x - x_0\| < \bar{\tau}$.

Lemma 3.6. If $x \in X$, $\|x - x_0\| \le t < R$ then

\[ \|F'(x_0)^{-1}F'(x)\| \le 2 + f'(t). \]

Proof. Simple algebraic manipulation together with assumption (2.1) give us

\[ \|F'(x_0)^{-1}F'(x)\| \le \|I\| + \|F'(x_0)^{-1}[F'(x) - F'(x_0)]\| \le 1 + f'(\|x - x_0\|) - f'(0). \]

Hence, h1, h2 and the last inequality imply the statement of the lemma. □

Lemma 3.7. Take $\theta \ge 0$, $0 \le t < \lambda$, $x_*, x, y \in X$. If $\lambda < R$,
$\|x - x_0\| \le t$, $\|x_* - x\| \le \lambda - t$, $F(x_*) = 0$ and

\[ \|F'(x_0)^{-1}[F(x) + F'(x)(y - x)]\| \le \theta \|F'(x_0)^{-1}F(x)\|, \tag{3.4} \]

then

\[ \|x_* - y\| \le \left[\frac{1 + \theta}{2} + \frac{2\theta}{\kappa}\right] \|x_* - x\|, \tag{3.5} \]

\[ \|x_* - y\| \le \frac{1 + \theta}{2} \frac{D^- f'(\lambda)}{|f'(\lambda)|} \|x_* - x\|^2 + \theta \frac{2 + f'(\lambda)}{|f'(\lambda)|} \|x_* - x\|. \tag{3.6} \]

Proof. Since $F(x_*) = 0$, direct algebraic manipulation and (3.1) yield

\[ y - x_* = F'(x)^{-1}\left[E_F(x_*,x) + [F(x) + F'(x)(y - x)]\right]. \]

Using (3.4), properties of the norm and some simple manipulations we conclude from the last
equality that

\[ \|x_* - y\| \le \|F'(x)^{-1}F'(x_0)\| \left[\|F'(x_0)^{-1}E_F(x_*,x)\| + \theta \|F'(x_0)^{-1}F(x)\|\right]. \]

On the other hand, using again $F(x_*) = 0$ and the definition in (3.1) we have

\[ -F'(x_0)^{-1}F(x) = F'(x_0)^{-1}\left[E_F(x_*,x) + F'(x)(x_* - x)\right], \]

which using the triangle inequality yields

\[ \|F'(x_0)^{-1}F(x)\| \le \|F'(x_0)^{-1}E_F(x_*,x)\| + \|F'(x_0)^{-1}F'(x)\| \, \|x_* - x\|. \]

Combining the two above inequalities with Proposition 3.1, Corollary 3.4 with $y = x_*$ and
$s = \lambda - t$, and Lemma 3.6 we have

\[ \|x_* - y\| \le \frac{1}{|f'(t)|} \left[\frac{1 + \theta}{2} \frac{f'(\lambda) - f'(t)}{\lambda - t} \|x_* - x\|^2 + \theta (2 + f'(t)) \|x_* - x\|\right]. \]

Since $\|x_* - x\| \le \lambda - t$, $f' < -\kappa < 0$ in $[0,\lambda)$ and $f'$ is increasing, the
first inequality follows from the last inequality.

Using Proposition 1.1 and taking into account that $f' < 0$ in $[0,\lambda)$ and that $f'$ is
increasing, we obtain the second inequality from the above inequality. □

4 The inexact Newton iteration with relative error 

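In the next lemma we study a single inexact Newton iteration with relative error $\theta$.

Lemma 4.1. Take $t, \varepsilon, \theta \ge 0$, and $x \in C$ such that

\[ \|x - x_0\| \le t < \bar{t}, \qquad \|F'(x_0)^{-1}F(x)\| \le f(t) + \varepsilon, \qquad t - (1 + \theta)\frac{f(t) + \varepsilon}{f'(t)} < R. \tag{4.1} \]

If $y \in X$ and

\[ \|F'(x_0)^{-1}[F(x) + F'(x)(y - x)]\| \le \theta \|F'(x_0)^{-1}F(x)\|, \tag{4.2} \]

then

1. $\|y - x\| \le -(1 + \theta)\dfrac{f(t) + \varepsilon}{f'(t)}$;

2. $\|y - x_0\| \le t - (1 + \theta)\dfrac{f(t) + \varepsilon}{f'(t)} < R$;

3. $\|F'(x_0)^{-1}F(y)\| \le f\left(t - (1 + \theta)\dfrac{f(t) + \varepsilon}{f'(t)}\right) + \varepsilon + 2\theta(f(t) + \varepsilon)$.

Proof. Using Proposition 3.1 and the first inequality in (4.1) we conclude that $F'(x)$ is
non-singular and $\|F'(x)^{-1}F'(x_0)\| \le -1/f'(t)$. Therefore, using also the identity

\[ y - x = F'(x)^{-1}F'(x_0)\left[F'(x_0)^{-1}[F(x) + F'(x)(y - x)] - F'(x_0)^{-1}F(x)\right], \]

the triangle inequality and (4.2) we conclude that

\[ \|y - x\| \le \frac{-1}{f'(t)} (1 + \theta) \|F'(x_0)^{-1}F(x)\|. \]

To end the proof of item 1, use the above inequality and the second inequality in (4.1).

Item 2 follows from the triangle inequality, item 1 and the first and the third inequalities in
(4.1).

Using the definition of the error (3.1) we have

\[ F(y) = E_F(y,x) + F'(x_0)\left[F'(x_0)^{-1}[F(x) + F'(x)(y - x)]\right]. \]

Therefore, using the triangle inequality, (4.2) and the second inequality in (4.1) we have

\[ \|F'(x_0)^{-1}F(y)\| \le \|F'(x_0)^{-1}E_F(y,x)\| + \theta \|F'(x_0)^{-1}F(x)\| \le \|F'(x_0)^{-1}E_F(y,x)\| + \theta(f(t) + \varepsilon). \]

Using item 1 and Corollary 3.4 with $s = -(1 + \theta)(f(t) + \varepsilon)/f'(t)$ we have

\[ \|F'(x_0)^{-1}E_F(y,x)\| \le e_f\left(t - (1 + \theta)\frac{f(t) + \varepsilon}{f'(t)}, t\right) = f\left(t - (1 + \theta)\frac{f(t) + \varepsilon}{f'(t)}\right) + \varepsilon + \theta(f(t) + \varepsilon). \]

Direct combination of the two above inequalities yields the inequality in item 3. □

In view of Lemma 4.1, define, for $\theta \ge 0$, the auxiliary map
$n_\theta : [0,\bar{t}) \times [0,\infty) \to \mathbb{R} \times \mathbb{R}$,

\[ n_\theta(t,\varepsilon) := \left(t - (1 + \theta)\frac{f(t) + \varepsilon}{f'(t)}, \ \varepsilon + 2\theta(f(t) + \varepsilon)\right). \tag{4.3} \]

Let

\[ \Omega := \{(t,\varepsilon) \in \mathbb{R} \times \mathbb{R} : 0 \le t < \lambda, \ 0 \le \varepsilon \le \kappa t, \ 0 < f(t) + \varepsilon\}. \tag{4.4} \]

Lemma 4.2. If $0 \le \theta \le \Theta$, $(t,\varepsilon) \in \Omega$ and
$(t_+,\varepsilon_+) = n_\theta(t,\varepsilon)$, that is,

\[ t_+ = t - (1 + \theta)\frac{f(t) + \varepsilon}{f'(t)}, \qquad \varepsilon_+ = \varepsilon + 2\theta(f(t) + \varepsilon), \]

then $n_\theta(t,\varepsilon) \in \Omega$, $t < t_+$, $\varepsilon \le \varepsilon_+$ and

\[ f(t_+) + \varepsilon_+ \le \left(\frac{1 + \theta^2}{2}\right)(f(t) + \varepsilon). \]

Proof. Since $0 \le t < \lambda$, according to (2.6) we have $f'(t) < -\kappa < 0$. Therefore
$t < t_+$ and $\varepsilon \le \varepsilon_+$.

As $\varepsilon \le \kappa t$, $f(t) + \varepsilon > 0$ and $-1 \le f'(t) < f'(t) + \kappa < 0$,

\[ -\frac{f(t) + \varepsilon}{f'(t)} \le -\frac{f(t) + \kappa t}{f'(t)} = -\frac{f(t) + \kappa t}{f'(t) + \kappa}\left(1 + \frac{\kappa}{f'(t)}\right) \le -\frac{f(t) + \kappa t}{f'(t) + \kappa}(1 - \kappa). \tag{4.5} \]

The function $h(s) := f(s) + \kappa s$ is differentiable at $t$, $h'(t) < 0$, is strictly convex and

\[ \lim_{s \to \lambda_-} h(s) = 0. \]

Therefore, using Proposition 1.2 we have $t - h(t)/h'(t) < \lambda$, which is equivalent to

\[ -\frac{f(t) + \kappa t}{f'(t) + \kappa} < \lambda - t. \tag{4.6} \]

Combining the above inequality with (4.5) and the definition of $t_+$ we conclude that

\[ t_+ < t + (1 + \theta)(1 - \kappa)(\lambda - t). \]

Using (2.6) and (2.7) we have $(1 + \theta)(1 - \kappa) \le 1 - \theta \le 1$, which combined with the
above inequality yields $t_+ < \lambda$.

Using the definition of $\varepsilon_+$, the inequality $\varepsilon \le \kappa t$ and (2.6) we obtain

\[ \varepsilon_+ \le 2\theta(f(t) + \varepsilon) + \kappa t \le \kappa\left(t + (1 + \theta)(f(t) + \varepsilon)\right). \]

Using again the inequalities $f(t) + \varepsilon > 0$ and $-1 \le f'(t) < 0$ we have

\[ f(t) + \varepsilon \le -\frac{f(t) + \varepsilon}{f'(t)}. \]

Combining the two above inequalities with the definition of $t_+$ we obtain
$\varepsilon_+ \le \kappa t_+$.

For proving the two last inequalities, first note that from the definition of the linearization
error in (3.2) we have

\[ f(t_+) + \varepsilon_+ = f(t) + f'(t)(t_+ - t) + e_f(t_+,t) + 2\theta(f(t) + \varepsilon) + \varepsilon = \theta(f(t) + \varepsilon) + e_f(t_+,t) = \theta(f(t) + \varepsilon) + \int_t^{t_+} (f'(u) - f'(t))\,du. \]

Since $f'$ is strictly increasing we conclude that the integral is positive. So, the last equality
implies that $f(t_+) + \varepsilon_+ > \theta(f(t) + \varepsilon) \ge 0$. Taking $s \in [t_+,\lambda)$
and using the convexity of $f'$ we have

\[ \int_t^{t_+} (f'(u) - f'(t))\,du \le \int_t^{t_+} (f'(s) - f'(t)) \frac{u - t}{s - t}\,du = \frac{1}{2} \frac{(t_+ - t)^2}{s - t} (f'(s) - f'(t)). \]

Substituting the last inequality into the above equation we have

\[ f(t_+) + \varepsilon_+ \le \theta(f(t) + \varepsilon) + \frac{1}{2} \frac{(t_+ - t)^2}{s - t} (f'(s) - f'(t)) = \left(\theta + \frac{1}{2} \frac{(1 + \theta)^2}{s - t} \, \frac{f(t) + \varepsilon}{-f'(t)} \, \frac{f'(s) - f'(t)}{-f'(t)}\right)(f(t) + \varepsilon). \]

On the other hand, because $f'(s) + \kappa < 0$ and $-1 \le f'(t)$ it is easy to conclude that

\[ \frac{f'(s) - f'(t)}{-f'(t)} = \frac{f'(s) + \kappa - f'(t) - \kappa}{-f'(t)} \le 1 - \kappa. \]

Combining the last two inequalities with (4.5), (4.6) and taking into account that
$(1 + \theta)(1 - \kappa) \le 1 - \theta$ we conclude that

\[ f(t_+) + \varepsilon_+ \le \left(\theta + \frac{1}{2}(1 - \theta)^2 \frac{\lambda - t}{s - t}\right)(f(t) + \varepsilon), \]

and the result follows taking the limit $s \to \lambda_-$. □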
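
The scalar iteration (t, eps) -> n_theta(t, eps) studied in Lemma 4.2 is easy to simulate. The
sketch below does so for the quadratic majorant f(t) = (L/2) t^2 - t + b, which is an assumption
made for illustration (with R = +infinity and 2bL < 1), as are the function names, the default
parameters and the small floating-point slack of 1e-12. For this f one has
kappa = 1 - sqrt(2bL) and lambda = sqrt(2b/L). At every step the code records whether the new pair
stays in the region t < lambda, eps <= kappa t, and whether f(t) + eps contracts by the factor
(1 + theta^2)/2.

```python
import math


def n_theta(f, df, theta, t, eps):
    """Auxiliary map (4.3) applied to the pair (t, eps)."""
    r = f(t) + eps
    return t - (1.0 + theta) * r / df(t), eps + 2.0 * theta * r


def run_majorant_iteration(b=0.2, L=1.0, theta=0.2, steps=12):
    """Iterate n_theta from (0, 0) for the assumed majorant f(t) = (L/2) t^2 - t + b."""
    def f(t):
        return 0.5 * L * t * t - t + b

    def df(t):
        return L * t - 1.0

    kappa = 1.0 - math.sqrt(2.0 * b * L)    # closed form for this quadratic f
    lam = math.sqrt(2.0 * b / L)
    assert 0.0 <= theta <= kappa / (2.0 - kappa), "theta must not exceed Theta"
    rate = 0.5 * (1.0 + theta * theta)
    t, eps, rows = 0.0, 0.0, []
    for _ in range(steps):
        t_new, eps_new = n_theta(f, df, theta, t, eps)
        rows.append({
            "t": t_new,
            "eps": eps_new,
            # membership conditions of the region (4.4), up to rounding slack
            "in_region": t_new < lam and eps_new <= kappa * t_new + 1e-12,
            # contraction estimate of Lemma 4.2, up to rounding slack
            "contracts": f(t_new) + eps_new <= rate * (f(t) + eps) + 1e-12,
        })
        t, eps = t_new, eps_new
    return rows
```

Running `run_majorant_iteration()` with the defaults yields a strictly increasing sequence of
t values that stays below lambda, in line with the statement of the lemma.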

The outcome of an inexact Newton iteration is any point satisfying some error tolerance. Hence,
instead of a mapping for the Newton iteration, we shall deal with a family of mappings, describing
all possible inexact iterations.

Definition 4.3. For $0 \le \theta$, $\mathcal{N}_\theta$ is the family of maps
$N_\theta : B(x_0,\bar{t}) \to X$ such that

\[ \|F'(x_0)^{-1}[F(x) + F'(x)(N_\theta(x) - x)]\| \le \theta \|F'(x_0)^{-1}F(x)\|, \tag{4.7} \]

for each $x \in B(x_0,\bar{t})$.

If $x \in B(x_0,\bar{t})$, then $F'(x)$ is non-singular. Therefore, for $\theta = 0$, the family
$\mathcal{N}_0$ has a single element, namely the exact Newton iteration map

\[ N_0 : B(x_0,\bar{t}) \to X, \qquad x \mapsto N_0(x) = x - F'(x)^{-1}F(x). \]

Trivially, if $0 \le \theta \le \theta'$ then
$\mathcal{N}_0 \subseteq \mathcal{N}_\theta \subseteq \mathcal{N}_{\theta'}$. Hence,
$\mathcal{N}_\theta$ is non-empty for all $\theta \ge 0$.

Remark 4.4. For any $\theta \in (0,1)$ and $N_\theta \in \mathcal{N}_\theta$,

\[ N_\theta(x) = x \iff F(x) = 0, \qquad x \in B(x_0,\bar{t}). \]

This means that the fixed points of the inexact Newton iteration $N_\theta$ are the same as the
fixed points of the exact Newton iteration, namely, the zeros of $F$.

The main tool for the analysis of the inexact Newton method with a relative residual tolerance
will be a family of sets described below and analyzed in the ensuing proposition, which is a
combination of Lemmas 4.1 and 4.2. Define

\[ K(t,\varepsilon) := \{x \in X : \|x - x_0\| \le t, \ \|F'(x_0)^{-1}F(x)\| \le f(t) + \varepsilon\}, \tag{4.8} \]

and

\[ K := \bigcup_{(t,\varepsilon) \in \Omega} K(t,\varepsilon). \tag{4.9} \]

Recall that $n_\theta$, $\Omega$ and $\mathcal{N}_\theta$ were defined in (4.3), (4.4) and
Definition 4.3 respectively.

Proposition 4.5. Take $0 \le \theta \le \Theta$ and $N_\theta \in \mathcal{N}_\theta$. Then for any
$(t,\varepsilon) \in \Omega$ and $x \in K(t,\varepsilon)$

\[ N_\theta(K(t,\varepsilon)) \subseteq K(n_\theta(t,\varepsilon)) \subseteq K, \qquad \|N_\theta(x) - x\| \le t_+ - t, \]

where $t_+$ is the first component of $n_\theta(t,\varepsilon)$. Moreover,

\[ n_\theta(\Omega) \subseteq \Omega, \qquad N_\theta(K) \subseteq K. \tag{4.10} \]

Proof. Combine definitions (4.3), (4.4), Definition 4.3 and (4.8) with Lemmas 4.1 and 4.2. □

5 Convergence analysis



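Theorem 5.1. Take $0 \le \theta \le \Theta$ and $N_\theta \in \mathcal{N}_\theta$. For any
$(t_0,\varepsilon_0) \in \Omega$ and $y_0 \in K(t_0,\varepsilon_0)$ the sequences

\[ y_{k+1} = N_\theta(y_k), \qquad (t_{k+1},\varepsilon_{k+1}) = n_\theta(t_k,\varepsilon_k), \qquad k = 0, 1, \ldots, \tag{5.1} \]

are well defined,

\[ y_k \in K(t_k,\varepsilon_k), \qquad (t_k,\varepsilon_k) \in \Omega, \qquad k = 0, 1, \ldots, \tag{5.2} \]

the sequence $\{t_k\}$ is strictly increasing and converges to some $\tilde{t} \in (0,\lambda]$, the
sequence $\{\varepsilon_k\}$ is non-decreasing and converges to some
$\tilde{\varepsilon} \in [0,\kappa\lambda]$,

\[ \|F'(x_0)^{-1}F(y_k)\| \le f(t_k) + \varepsilon_k \le \left(\frac{1 + \theta^2}{2}\right)^k (f(t_0) + \varepsilon_0), \qquad k = 0, 1, \ldots, \tag{5.3} \]

the sequence $\{y_k\}$ is contained in $B(x_0,\lambda)$ and converges to a point
$x_* \in B[x_0,t_*]$ which is the unique zero of $F$ in $B(x_0,\bar{\tau})$ and

\[ \|y_{k+1} - y_k\| \le t_{k+1} - t_k, \qquad \|x_* - y_k\| \le \tilde{t} - t_k, \qquad k = 0, 1, \ldots. \tag{5.4} \]

Moreover, if

h4') $\lambda < R$,

then the sequence $\{y_k\}$ satisfies, for $k = 0, 1, \ldots$,

\[ \|x_* - y_{k+1}\| \le \frac{1 + \theta}{2} \frac{D^- f'(\lambda)}{|f'(\lambda)|} \|x_* - y_k\|^2 + \theta \frac{2 + f'(\lambda)}{|f'(\lambda)|} \|x_* - y_k\|. \tag{5.5} \]

If, additionally, $0 \le \theta < \kappa/(4 + \kappa)$ then $\{y_k\}$ converges Q-linearly as follows

\[ \|x_* - y_{k+1}\| \le \left[\frac{1 + \theta}{2} + \frac{2\theta}{\kappa}\right] \|x_* - y_k\|, \qquad k = 0, 1, \ldots. \tag{5.6} \]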

Proof. Well-definedness of the sequences $\{(t_k,\varepsilon_k)\}$ and $\{y_k\}$ as defined in
(5.1) follows from the assumptions on $\theta$, $(t_0,\varepsilon_0)$, $y_0$ and the two last
inclusions in Proposition 4.5. Moreover, since (5.2) holds for $k = 0$, using the first inclusion in
Proposition 4.5 and induction on $k$, we conclude that (5.2) holds for all $k$. The first inequality
in (5.4) now follows from Proposition 4.5, (5.2) and (5.1), while the first inequality in (5.3)
follows from (5.2) and the definition of $K(t,\varepsilon)$ in (4.8).

Direct inspection of the definition of $\Omega$ in (4.4) shows that

\[ \Omega \subseteq [0,\lambda) \times [0,\kappa\lambda). \]

Therefore, using (5.2) and the definition of $K(t,\varepsilon)$ we have

\[ t_k \in [0,\lambda), \qquad \varepsilon_k \in [0,\kappa\lambda), \qquad y_k \in B(x_0,\lambda), \qquad k = 0, 1, \ldots. \]

Using (4.4) and Lemma 4.2 we conclude that $\{t_k\}$ is strictly increasing, $\{\varepsilon_k\}$ is
non-decreasing and the second inequality in (5.3) holds for all $k$. Therefore, in view of the first
two above inclusions, $\{t_k\}$ and $\{\varepsilon_k\}$ converge, respectively, to some
$\tilde{t} \in (0,\lambda]$ and $\tilde{\varepsilon} \in [0,\kappa\lambda]$. Convergence to
$\tilde{t}$, together with the first inequality in (5.4) and the inclusion $y_k \in B(x_0,\lambda)$,
implies that $y_k$ converges to some $x_* \in B[x_0,\lambda]$ and that the second inequality in
(5.4) holds for all $k$.

Using the inclusion $y_k \in B(x_0,\lambda)$, the first inequality in Corollary 3.5 and (5.3) we have

\[ -f(\|y_k - x_0\|) \le \left(\frac{1 + \theta^2}{2}\right)^k (f(t_0) + \varepsilon_0), \qquad k = 0, 1, \ldots. \]

According to (2.8), $f' < -\kappa$ in $[0,\lambda)$. Therefore, since $f(t_*) = 0$ and
$t_* < \lambda$,

\[ f(t) \le -\kappa(t - t_*), \qquad t_* \le t < \lambda. \]

Hence, if $\|y_k - x_0\| \ge t_*$, we can combine the two above inequalities, setting
$t = \|y_k - x_0\|$ in the second, to obtain

\[ \|y_k - x_0\| - t_* \le \frac{1}{\kappa} \left(\frac{1 + \theta^2}{2}\right)^k (f(t_0) + \varepsilon_0). \]

Note that the above inequality remains valid even if $\|y_k - x_0\| < t_*$. Therefore, taking the
limit $k \to \infty$ in the above inequality we conclude that $\|x_* - x_0\| \le t_*$. Moreover, now
that we know that $x_*$ is in the interior of the domain of $F$, we can also take the limit
$k \to \infty$ in (5.3) to conclude that

\[ F(x_*) = 0. \]

The "classical" version of Kantorovich's theorem on Newton's method for a generic majorant
function (see e.g. [6]) guarantees that, under the assumptions of Theorem 2.1, $F$ has a unique
zero in $B(x_0,\bar{\tau})$. Hence $x_*$ must be this zero of $F$.

To prove (5.5) and (5.6), first note that from the first inclusion in (5.2) we have
$\|y_k - x_0\| \le t_k$, for all $k = 0, 1, \ldots$. Now, since $\tilde{t} \in (0,\lambda]$ we obtain
from the second inequality in (5.4) that $\|x_* - y_k\| \le \lambda - t_k$, for all
$k = 0, 1, \ldots$. Therefore, using h4', $F(x_*) = 0$ and the first equality in (5.1), the desired
inequalities follow by applying Lemma 3.7. For concluding the proof, note that for
$0 \le \theta < \kappa/(4 + \kappa)$ the quantity in the bracket in (5.6) is less than one, which
implies that the sequence $\{y_k\}$ converges Q-linearly. □



Proposition 5.2. If $0 \le \rho < \beta/2$ then

\[ \rho < \bar{t}/2 < \bar{t}, \qquad f'(\rho) < 0. \]

Proof. The assumption $\rho < \beta/2$ and Proposition 2.3 item iii prove the first two inequalities
of the proposition. The last inequality follows from the first inequality and Proposition 2.3
item i. □

Proof of Theorem 2.1. First we will prove Theorem 2.1 with $\rho = 0$ and $z_0 = x_0$. Note that,
from the definition in (2.6), we have

\[ \kappa_0 = \kappa, \qquad \lambda_0 = \lambda, \qquad \Theta_0 = \Theta. \]

Since

\[ (0,0) \in \Omega, \qquad x_0 \in K(0,0), \]

using Theorem 5.1 we conclude that Theorem 2.1 holds for $\rho = 0$.
For proving the general case, take

\[ 0 \le \rho < \beta/2, \qquad z_0 \in B[x_0,\rho]. \tag{5.7} \]

Using Proposition 5.2 and (2.5) we conclude that $\rho < \bar{t}/2$ and $f'(\rho) < 0$. Define

\[ g : [0,R - \rho) \to \mathbb{R}, \qquad g(t) = \frac{-1}{f'(\rho)} [f(t + \rho) + 2\rho]. \tag{5.8} \]

We claim that $g$ is a majorant function for $F$ at point $z_0$. Trivially,
$B(z_0,R - \rho) \subseteq C$, $g'(0) = -1$, $g(0) > 0$. Moreover $g'$ is also convex and strictly
increasing. To end the proof that $g$ satisfies h1, h2 and h3, using Proposition 2.3 item iii and
the second inequality in (5.7) we have

\[ \lim_{t \to (\bar{t} - \rho)_-} g(t) = \frac{-1}{f'(\rho)} (2\rho - \beta) < 0. \]

Using Proposition 3.1 we have

\[ \|F'(z_0)^{-1}F'(x_0)\| \le \frac{-1}{f'(\rho)}. \tag{5.9} \]

Therefore, using also the second inequality of Corollary 3.5 we have

\[ \|F'(z_0)^{-1}F(z_0)\| \le \|F'(z_0)^{-1}F'(x_0)\| \, \|F'(x_0)^{-1}F(z_0)\| \le \frac{-1}{f'(\rho)} [f(\|z_0 - x_0\|) + 2\|z_0 - x_0\|]. \]

As $f' \ge -1$, the function $t \mapsto f(t) + 2t$ is (strictly) increasing. Combining this fact
with the above inequality and (5.8) we conclude that

\[ \|F'(z_0)^{-1}F(z_0)\| \le g(0). \]

To end the proof that $g$ is a majorant function for $F$ at $z_0$, take $x, y \in X$ such that

\[ x, y \in B(z_0,R - \rho), \qquad \|x - z_0\| + \|y - x\| < R - \rho. \]

Hence $x, y \in B(x_0,R)$, $\|x - x_0\| + \|y - x\| < R$ and using (5.9) together with (2.1) we have

\[ \|F'(z_0)^{-1}[F'(y) - F'(x)]\| \le \|F'(z_0)^{-1}F'(x_0)\| \, \|F'(x_0)^{-1}[F'(y) - F'(x)]\| \le \frac{-1}{f'(\rho)} [f'(\|y - x\| + \|x - x_0\|) - f'(\|x - x_0\|)]. \]

Since $f'$ is convex, the function $t \mapsto f'(s + t) - f'(t)$ is increasing for $s \ge 0$ and
$\|x - x_0\| \le \|x - z_0\| + \|z_0 - x_0\| \le \|x - z_0\| + \rho$,

\[ f'(\|y - x\| + \|x - x_0\|) - f'(\|x - x_0\|) \le f'(\|y - x\| + \|x - z_0\| + \rho) - f'(\|x - z_0\| + \rho). \]

Combining the two above inequalities with the definition of $g$ we obtain

\[ \|F'(z_0)^{-1}[F'(y) - F'(x)]\| \le g'(\|y - x\| + \|x - z_0\|) - g'(\|x - z_0\|). \]

Note that for $\kappa_\rho$, $\lambda_\rho$ and $\Theta_\rho$ as defined in (2.3), we have

\[ \kappa_\rho = \sup_{0 < t < R - \rho} \frac{-g(t)}{t}, \qquad \lambda_\rho = \sup\{t \in [0,R - \rho) : \kappa_\rho + g'(t) < 0\}, \qquad \Theta_\rho = \frac{\kappa_\rho}{2 - \kappa_\rho}, \]

which are the same as (2.3) with $g$ instead of $f$. Therefore, applying Theorem 2.1 for $F$ and the
majorant function $g$ at point $z_0$ and $\rho = 0$, we conclude that the sequence $\{z_k\}$ is well
defined, remains in $B(z_0,\lambda_\rho)$, satisfies (2.4) and converges to some
$z_* \in B[z_0,t_{*,\rho}]$ which is a zero of $F$, where $t_{*,\rho}$ is the smallest solution of
$g(t) = 0$. Using (5.8) we conclude that $t_{*,\rho}$ is the smallest solution of

\[ f(\rho + t) + 2\rho = 0. \]

Hence, in view of Proposition 2.3 item ii, we have $\rho + t_{*,\rho} < \bar{t} \le \bar{\tau}$ and
$B[z_0,t_{*,\rho}] \subset B(x_0,\bar{\tau})$. Therefore, $z_*$ is the unique zero of $F$ in
$B(x_0,\bar{\tau})$, which we already called $x_*$. Since

\[ g'(t) = \frac{f'(t + \rho)}{|f'(\rho)|}, \qquad D^- g'(t) = \frac{D^- f'(t + \rho)}{|f'(\rho)|}, \qquad t \in [0,R - \rho), \]

applying again Theorem 2.1 for $F$ and the majorant function $g$ at point $z_0$ and $\rho = 0$, we
conclude that item h4 also holds. □



6 Special cases 

First we use Theorem 2.1 to analyze the convergence of the inexact Newton method with a relative
residual error tolerance in the setting of Smale's $\alpha$-theory. To our knowledge, this is the
first time an inexact Newton method with a relative error tolerance is analyzed in this framework.

Theorem 6.1. Let X and Y be Banach spaces, C C X and F : C — )• Y a continuous function and 
analytic int{C). Take xq G int(C) with F'(xo) non-singular. Define 



7 := sup 

n>l 



F'(xo)"^F(")| 



Xq) 



ni 



l/(n-l) 
Suppose that $B(x_0,1/\gamma) \subset C$, $b > 0$ and that

\[ \|F'(x_0)^{-1}F(x_0)\| \leq b, \qquad b\gamma < 3 - 2\sqrt{2}, \qquad 0 \leq \theta \leq \frac{1 - 2\sqrt{\gamma b} - \gamma b}{1 + 2\sqrt{\gamma b} + \gamma b}. \]

Then, the sequence generated by the inexact Newton method for solving $F(x) = 0$ with starting
point $x_0$ and residual relative error tolerance $\theta$: for $k = 0, 1, \ldots$,

\[ x_{k+1} = x_k + S_k, \qquad \|F'(x_0)^{-1}[F(x_k) + F'(x_k)S_k]\| \leq \theta\,\|F'(x_0)^{-1}F(x_k)\|, \]

is well defined, the generated sequence $\{x_k\}$ converges to a point $x_*$ which is a zero of $F$,

\[ \|F'(x_0)^{-1}F(x_k)\| \leq \left(\frac{1+\theta^2}{2}\right)^{k} b, \qquad k = 0, 1, \ldots, \]

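the sequence $\{x_k\}$ is contained in $B(x_0,\lambda)$, $x_* \in B[x_0,t_*]$ and $x_*$ is the unique zero of $F$ in $B(x_0,\bar{\tau})$,
where

\[ \lambda := \frac{b}{\sqrt{\gamma b} + \gamma b}, \qquad t_* := \frac{1 + \gamma b - \sqrt{1 - 6\gamma b + (\gamma b)^2}}{4\gamma}, \qquad \bar{\tau} := \frac{1 + \gamma b + \sqrt{1 - 6\gamma b + (\gamma b)^2}}{4\gamma}. \]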

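Moreover, the sequence $\{x_k\}$ satisfies, for $k = 0, 1, \ldots$,

\[ \|x_* - x_{k+1}\| \leq \left[ \frac{1+\theta}{2}\,\frac{D^{-}f'(\lambda)}{|f'(\lambda)|}\,\|x_* - x_k\| + \theta\,\frac{f'(\lambda) + 2}{|f'(\lambda)|} \right] \|x_* - x_k\|, \]

where $f$ is the majorant function given in the proof below.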
If, additionally, $0 \leq \theta < (1 - 2\sqrt{\gamma b} - \gamma b)/(5 - 2\sqrt{\gamma b} - \gamma b)$, then $\{x_k\}$ converges Q-linearly as follows:

\[ \|x_* - x_{k+1}\| \leq \left[ \frac{1+\theta}{2} + \frac{2\theta}{1 - 2\sqrt{\gamma b} - \gamma b} \right] \|x_* - x_k\|, \qquad k = 0, 1, \ldots. \]

Proof. The function $f : [0, 1/\gamma) \to \mathbb{R}$,

\[ f(t) := \frac{t}{1 - \gamma t} - 2t + b, \]

is a majorant function for $F$ at $x_0$; see [6]. Therefore, all results follow from Theorem 2.1 applied to
this particular context. □
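The constants in Theorem 6.1 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the definitions $\kappa = \sup_{0<t<R} -f(t)/t$ and $\lambda = \sup\{t : \kappa + f'(t) < 0\}$ from (2.3) with $\rho = 0$, and the closed form $\kappa = 1 - 2\sqrt{\gamma b} - \gamma b$ obtained from them by elementary calculus. The function names and the sample data `b = 0.05`, `gamma = 2.0` are hypothetical.

```python
import math

def smale_constants(b, gamma):
    """Constants of Theorem 6.1 for f(t) = t/(1 - gamma*t) - 2t + b."""
    a = gamma * b
    assert b > 0 and gamma > 0 and a < 3.0 - 2.0 * math.sqrt(2.0)
    kappa = 1.0 - 2.0 * math.sqrt(a) - a          # assumed closed form of sup -f(t)/t
    disc = math.sqrt(1.0 - 6.0 * a + a * a)
    return {
        "kappa": kappa,
        "lam": b / (math.sqrt(a) + a),             # radius lambda
        "t_star": (1.0 + a - disc) / (4.0 * gamma),
        "tau_bar": (1.0 + a + disc) / (4.0 * gamma),
        "theta_max": kappa / (2.0 - kappa),        # largest admissible theta
        "theta_qlin": kappa / (4.0 + kappa),       # Q-linear threshold
    }

def f(t, b, gamma):
    """Majorant function of Theorem 6.1."""
    return t / (1.0 - gamma * t) - 2.0 * t + b

def df(t, gamma):
    """Derivative of the majorant function."""
    return 1.0 / (1.0 - gamma * t) ** 2 - 2.0

# Hypothetical data with gamma * b = 0.1 < 3 - 2*sqrt(2).
b, gamma = 0.05, 2.0
c = smale_constants(b, gamma)

# Grid estimate of kappa = sup_{0 < t < 1/gamma} -f(t)/t.
n = 100000
grid = [(i / n) / gamma for i in range(1, n)]
kappa_grid = max(-f(t, b, gamma) / t for t in grid)
```

In this sketch `t_star` and `tau_bar` are the two roots of $f$, and `lam` solves $\kappa + f'(\lambda) = 0$, so the grid estimate and the closed forms can be compared directly.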

A semi-local convergence result for Newton method is instrumental in the complexity analysis
of linear and quadratic minimization problems by means of self-concordant functions [15]. Also
in this setting, Theorem 2.1 provides a semi-local convergence result for Newton method with a
relative error tolerance.
Theorem 6.2. Let $C \subset \mathbb{R}^n$ be an open convex set and let $g : C \to \mathbb{R}$ be an $a$-self-concordant
function with parameter $a > 0$. For $x \in C$, let

\[ \|v\|_x := \sqrt{v^T g''(x) v}, \qquad v \in \mathbb{R}^n, \]

\[ W_r(x) := \{z : \|z - x\|_x < r\}, \qquad W_r[x] := \{z : \|z - x\|_x \leq r\}. \]
Suppose that $x_0 \in C$, $g''(x_0)$ is non-singular, $b > 0$,

\[ \|g''(x_0)^{-1}g'(x_0)\|_{x_0} \leq b < 3 - 2\sqrt{2}, \qquad 0 \leq \theta \leq \frac{1 - 2\sqrt{b} - b}{1 + 2\sqrt{b} + b}. \]

Then the sequence generated by the inexact Newton method for solving $g'(x) = 0$ with starting point
$x_0$ and residual relative error tolerance $\theta$: for $k = 0, 1, \ldots$,

\[ x_{k+1} = x_k + S_k, \qquad \|g''(x_0)^{-1}[g'(x_k) + g''(x_k)S_k]\|_{x_0} \leq \theta\,\|g''(x_0)^{-1}g'(x_k)\|_{x_0}, \]
is well defined, converges to a point $x_*$ which is the (unique, global) minimizer of $g$,

\[ \|g''(x_0)^{-1}g'(x_k)\|_{x_0} \leq \left(\frac{1+\theta^2}{2}\right)^{k} b, \qquad k = 0, 1, \ldots, \]

the sequence $\{x_k\}$ is contained in $W_\lambda(x_0)$ and $x_* \in W_{t_*}[x_0]$, where

\[ \lambda := \frac{b}{\sqrt{b} + b}, \qquad t_* := \frac{1 + b - \sqrt{1 - 6b + b^2}}{4}. \]

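Moreover, the sequence $\{x_k\}$ satisfies, for $k = 0, 1, \ldots$,

\[ \|x_* - x_{k+1}\|_{x_0} \leq \left[ \frac{1+\theta}{2}\,\frac{D^{-}f'(\lambda)}{|f'(\lambda)|}\,\|x_* - x_k\|_{x_0} + \theta\,\frac{f'(\lambda) + 2}{|f'(\lambda)|} \right] \|x_* - x_k\|_{x_0}, \]

where $f$ is the majorant function given in the proof below.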
If, additionally, $0 \leq \theta < (1 - 2\sqrt{b} - b)/(5 - 2\sqrt{b} - b)$, then $\{x_k\}$ converges Q-linearly as follows:

\[ \|x_* - x_{k+1}\|_{x_0} \leq \left[ \frac{1+\theta}{2} + \frac{2\theta}{1 - 2\sqrt{b} - b} \right] \|x_* - x_k\|_{x_0}, \qquad k = 0, 1, \ldots. \]

Proof. The scalar function $f : [0, 1) \to \mathbb{R}$ defined by

\[ f(t) := \frac{t}{1 - t} - 2t + b, \]

is a majorant function for $g'$ at $x_0$; see [6]. Therefore, the proof follows from Theorem 2.1 applied to
this particular context. □
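The thresholds in Theorem 6.2 can be traced back to the majorant function by a short computation. The derivation below is a sketch under the assumption that, as in (2.3) with $\rho = 0$, $\kappa = \sup_{0<t<1} -f(t)/t$, $\lambda = \sup\{t : \kappa + f'(t) < 0\}$ and $\Theta = \kappa/(2-\kappa)$; the symbol $\hat{t}$ for the maximizer is introduced here for illustration only.

```latex
\begin{align*}
-\frac{f(t)}{t} &= 2 - \frac{1}{1-t} - \frac{b}{t}, \qquad 0 < t < 1, \\
\frac{d}{dt}\left(-\frac{f(t)}{t}\right) &= -\frac{1}{(1-t)^2} + \frac{b}{t^2} = 0
  \iff t = \hat{t} := \frac{\sqrt{b}}{1+\sqrt{b}}, \\
\kappa &= 2 - (1+\sqrt{b}) - \sqrt{b}\,(1+\sqrt{b}) = 1 - 2\sqrt{b} - b, \\
\Theta &= \frac{\kappa}{2-\kappa} = \frac{1 - 2\sqrt{b} - b}{1 + 2\sqrt{b} + b},
  \qquad \frac{\kappa}{4+\kappa} = \frac{1 - 2\sqrt{b} - b}{5 - 2\sqrt{b} - b}, \\
f'(\lambda) = \frac{1}{(1-\lambda)^2} - 2 = -\kappa
  &\iff \lambda = 1 - \frac{1}{1+\sqrt{b}} = \frac{b}{\sqrt{b} + b}.
\end{align*}
```

Since $-f(t)/t$ is concave on $(0,1)$, the critical point $\hat{t}$ is its maximizer, and $\kappa > 0$ exactly when $\sqrt{b} < \sqrt{2} - 1$, that is, when $b < 3 - 2\sqrt{2}$.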
Theorem 6.3. Let $X$ and $Y$ be Banach spaces, $C \subseteq X$ and $F : C \to Y$ a continuous function,
continuously differentiable on $\mathrm{int}(C)$. Take $x_0 \in \mathrm{int}(C)$ with $F'(x_0)$ non-singular. Suppose that
there exist constants $L > 0$ and $b > 0$ such that $bL < 1/2$, $B(x_0, 1/L) \subset C$ and

\[ \|F'(x_0)^{-1}[F'(y) - F'(x)]\| \leq L\|x - y\|, \qquad x, y \in B(x_0, 1/L), \]

\[ \|F'(x_0)^{-1}F(x_0)\| \leq b, \qquad 0 \leq \theta \leq \frac{1 - \sqrt{2bL}}{1 + \sqrt{2bL}}. \]

Then, the sequence generated by the inexact Newton method for solving $F(x) = 0$ with starting
point $x_0$ and residual relative error tolerance $\theta$: for $k = 0, 1, \ldots$,

\[ x_{k+1} = x_k + S_k, \qquad \|F'(x_0)^{-1}[F(x_k) + F'(x_k)S_k]\| \leq \theta\,\|F'(x_0)^{-1}F(x_k)\|, \]

is well defined,

\[ \|F'(x_0)^{-1}F(x_k)\| \leq \left(\frac{1+\theta^2}{2}\right)^{k} b, \qquad k = 0, 1, \ldots, \]

the sequence $\{x_k\}$ is contained in $B(x_0,\lambda)$, converges to a point $x_* \in B[x_0,t_*]$ which is the unique
zero of $F$ in $B(x_0,1/L)$, where

\[ \lambda := \frac{\sqrt{2bL}}{L}, \qquad t_* := \frac{1 - \sqrt{1 - 2Lb}}{L}. \]

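Moreover, the sequence $\{x_k\}$ satisfies, for $k = 0, 1, \ldots$,

\[ \|x_* - x_{k+1}\| \leq \left[ \frac{1+\theta}{2}\,\frac{L}{1 - \sqrt{2bL}}\,\|x_* - x_k\| + \theta\,\frac{1 + \sqrt{2bL}}{1 - \sqrt{2bL}} \right] \|x_* - x_k\|. \]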

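If, additionally, $0 \leq \theta < (1 - \sqrt{2bL})/(5 - \sqrt{2bL})$, then the sequence $\{x_k\}$ converges Q-linearly as
follows:

\[ \|x_* - x_{k+1}\| \leq \left[ \frac{1+\theta}{2} + \frac{2\theta}{1 - \sqrt{2bL}} \right] \|x_* - x_k\|, \qquad k = 0, 1, \ldots. \]
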
Proof. Since the function $f : [0, 1/L) \to \mathbb{R}$,

\[ f(t) := \frac{L}{2}t^2 - t + b, \]

is a majorant function for $F$ at point $x_0$, all results follow from Theorem 2.1 applied to this particular
context. □
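To make the statement of Theorem 6.3 concrete, the following is a small numerical sketch on a scalar toy problem. Everything specific in it is an assumption chosen for illustration: the problem $F(x) = x^2 - 2$ with $x_0 = 1.5$, the function names, and the rule $S_k = -(1-\theta)F(x_k)/F'(x_k)$, which makes the residual condition hold with equality. It is not the authors' implementation.

```python
import math

def kantorovich_constants(b, L):
    """Constants of Theorem 6.3 for the majorant f(t) = (L/2) t^2 - t + b."""
    assert b > 0 and L > 0 and b * L < 0.5
    root = math.sqrt(2.0 * b * L)
    return {
        "lam": root / L,                                   # radius lambda
        "t_star": (1.0 - math.sqrt(1.0 - 2.0 * L * b)) / L,
        "theta_max": (1.0 - root) / (1.0 + root),          # largest admissible theta
        "theta_qlin": (1.0 - root) / (5.0 - root),         # Q-linear threshold
    }

def inexact_newton(F, dF, x0, theta, steps):
    """Scalar inexact Newton iteration meeting the tolerance with equality.

    Illustrative assumption: S_k = -(1 - theta) F(x_k) / F'(x_k), so that
    F(x_k) + F'(x_k) S_k = theta * F(x_k).
    """
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - (1.0 - theta) * F(x) / dF(x))
    return xs

def F(x):
    return x * x - 2.0          # hypothetical toy problem

def dF(x):
    return 2.0 * x

x0 = 1.5
b = abs(F(x0) / dF(x0))         # |F'(x0)^{-1} F(x0)| = 1/12
L = 2.0 / abs(dF(x0))           # |F'(x0)^{-1} (F'(y) - F'(x))| = (2/3)|y - x|
c = kantorovich_constants(b, L)

theta = 0.2                      # satisfies 0 <= theta <= theta_max
xs = inexact_newton(F, dF, x0, theta, steps=25)
residuals = [abs(F(x) / dF(x0)) for x in xs]
bounds = [((1.0 + theta ** 2) / 2.0) ** k * b for k in range(len(xs))]
```

For this quadratic example the majorant function is sharp: `t_star` coincides with the distance from $x_0$ to the zero $\sqrt{2}$, and the lists `residuals` and `bounds` can be compared entry by entry against the residual estimate of the theorem.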